The inclusion of English Language Learners as a subgroup in the No Child Left Behind legislation
has lent additional importance to the need for valid and efficient measures of reading for
students whose first language is not English. This study examines the use of Curriculum-Based
Measurement (CBM) reading fluency as a predictor of later reading performance on state
accountability tests for fifth grade ELL students. The findings of this study indicate that CBM is
a significant predictor of later performance on tests for accountability for ELL students as a 
whole, and for the individual language groups of Spanish, Hmong, and Somali. Implications for 
these findings are discussed. 

One of the greatest contemporary opportunities and challenges in America is the education of cul- 
turally and linguistically diverse students whose first language is not English. The general term, English 
Language Learner (ELL), is used to describe a group of students who are non-native English speakers 
and who score low on a measure of English language proficiency. The No Child Left Behind Act (NCLB, 
2001) refers to this group as students with limited English proficiency, and defines them as students who 
belong to one of the following categories: 

a) Was not born in the United States or speaks a native language other than English; 

b) Is a Native American, Alaska Native, or native resident of outlying areas and comes from an
environment where language other than English has had a significant impact on the individual’s level
of English language proficiency; or

c) Is migratory, speaks a native language other than English, and comes from an environment where
language other than English is dominant; or

d) May be unable, because of difficulties in speaking, reading, writing, or understanding the English 
language, to score at the proficient level on state assessments of academic achievement, learn 
successfully in classrooms where the language of instruction is English, or participate fully in 
society. 

The Number and Achievement of ELL Students 

Estimates during the 1990s indicated there was an increase of about 1 million ELL students, which 
resulted in about 5.5% of all students being served in public schools speaking English as a non-primary 
language (National Research Council, 1997). Kindler (2002) estimated that the number had climbed 
even higher during the 1999-2000 year with an estimated 4.4 million ELL students in public schools 
(about 9% of all students in public education). The US Census Bureau (2000) estimated that about 18% 
of children between the ages of 5 and 17 speak a language other than English as their primary language 
in the home. 

Unfortunately, the educational achievements of these ELL students have not increased as dramati- 
cally as their numbers. While there are differences between home-language groups, studies have found 
that ELL students in general are lower performing on tests of academic achievement when compared to 
their English-speaking peers (August & Hakuta, 1997; Moss & Puma, 1995). These types of outcomes 
for ELL students, which may be related to the linguistic complexity of the items included in the test,
have been found in other research on both mathematics and reading (Abedi, 2002; Abedi, Lord & Hofstetter,
1998; Cocking & Chipman, 1988; Liu, Anderson, & Thurlow, 2000; Thomas & Collier, 1997).
Hopstock and Stephenson (2003) found that, when taking state-required tests for graduation, students with
limited English proficiency were much more likely to fail than were the student group as a whole (50% 
vs. 24%). August and Hakuta (1997) also found higher dropout rates for ELL students. The need for ELL 
students to accelerate their academic achievement has received new emphasis with the implementation 
of the No Child Left Behind Act of 2001. Under this law, all children, including the specific subgroup 
ELL students, are expected to reach a proficient level in reading and math each and every year starting 
at the third grade. 

Curriculum-Based Assessment of Reading 

Curriculum-Based Measurement (CBM) as a measure of oral reading fluency has long been shown
to be an efficient and valid measure of academic progress for English-speaking general education and
special education students. For example, in an examination of 11 studies looking at various measures
of the reliability of CBM, Marston (1989) found a mean reliability rating of .91. The validity of CBM
measures has also been established through numerous studies showing a strong relationship between
measures of oral reading fluency and a variety of standardized reading assessments (Fuchs & Deno,
1981; Fuchs & Fuchs, 1999; Shinn, Good, Knutson, Tilly & Collins, 1992). It has also been shown to be
a good measure of reading comprehension across grades (Kranzler, Miller & Jordan, 1999; Shinn et al.,
1992) and with specific subgroups (Deno, Fuchs, Marston, & Shin, 2001; Hintze, Callahan, Matthews, 
Williams, & Tobin, 2002). 

One recent development in CBM research is to examine the relationship between oral reading flu- 
ency and student performance on state accountability tests (Deno, 2003). Several studies have reported 
moderate to high correlations between CBM and state assessments (e.g., Good, Simmons, & Kame’enui,
2001; McGlinchey & Hixson, 2004; Pearce & Gayle, 2009; Sibley, Biwer, & Hesch, 2001; Stage & Ja- 
cobsen, 2001). In addition, the validity of using benchmark goals or cut scores on CBM measures to pre- 
dict pass and fail rates on high-stakes assessments has also been supported (Hintze & Silberglitt, 2005). 

Despite the extensive study of CBM, and its widespread use, published research on the use of CBM 
for ELL students is limited. Baker and Good (1995) investigated the reliability and validity issues of 
CBM in English with bilingual Hispanic students. They concluded that CBM was as reliable and valid 
for Hispanic bilingual students as for their English speaking peers. The convergent and discriminant data 
from this study provided further support for CBM as a measure of English proficiency in reading and
comprehension for bilingual students. 

In another relevant study that included Hispanic and Caucasian youth, Klein and Jimerson (2005) 
examined the potential bias of oral reading fluency as a predictor of future reading proficiency, consider- 
ing gender, ethnicity, home language, and socioeconomic status. Analyses of longitudinal data from 398 
students enrolled in grades 1-3 revealed consistent intercept bias effects for the combination of ethnicity
and home language factors at grades one, two, and three. Specifically, the results indicated that, when
using a common regression equation, oral reading fluency probes overpredicted the reading proficiency 
(as measured by the Stanford Achievement Test - Ninth Edition (SAT-9) Total Reading) of Hispanic stu- 
dents whose home language is Spanish and underpredicted the reading proficiency of Caucasian students 
whose home language is English. More recent research development in this area involves investigations 
using nonsense word fluency (NWF) to predict reading performance. Studies have found early literacy 
skills such as alphabetic understanding and phonological recoding ability measured by NWF have a significant
predictive value on real-word reading and reading performance on standardized measures such
as state accountability tests (e.g., Vanderwood, Linklater, & Healy, 2008; Fien et al., 2008). With few
studies examining CBM among ELL students, further research is warranted. 

The ability to evaluate and predict reading ability of students can be particularly challenging with 
students whose primary language is not English. It is not uncommon for school-based professionals to 
question the validity of CBM measures with ELL students by pointing out that reading fluency does not
necessarily correspond to comprehension. These professionals maintain that ELL students can at times
decode words without having the contextual or topical knowledge needed to understand what they are 
actually saying. This may result in fluency scores that indicate mastery, while the student does not really 
understand what they have read, and thus they are likely to suffer on comprehension based assessments. 
Based upon this argument, some ELL advocates promote a portfolio or theme-based assessment system. 
For instance, Sudweeks, Glissmeyer, Morrison, Wilcox, and Tanner (2004) recommended oral retellings
to assess ELL students’ reading comprehension. On the other hand, in a study of 66 third grade students
in the Pacific Northwest, Hamilton and Shinn (2003) reported that “word callers” (students who can read
fluently but do not understand what they read) scored fewer correct words per minute and earned
significantly lower scores on comprehension measures. Although the participants in that study were
not ELL students, it seems evident that students who read poorly and without fluency are also likely to
comprehend poorly.

The complexity of the issues related to language acquisition and reading fluency and comprehension
is also related to the nature and the characteristics of the home language spoken by the student. For
example, Spanish and English are phonetic-based languages that share many underlying cognates, or
common word origins, the teaching of which can be used as a strategy to enhance vocabulary development
(Carlo, August, McLaughlin, Snow, & Dressler, 2004). Hmong, however, has distinctly different
grammatical and phonemic usage than English and belongs to a group of languages, often referred to
as the Miao-Yao languages, spoken in Southeast Asia and Southern China. Unlike English, Hmong is
mostly a monosyllabic language. Moreover, it is a tonal language, meaning pitch variations are used to
signal a difference in meaning among words. On the other hand, the Somali language is a member of the
Cushitic languages spoken mostly in Somalia and nearby Djibouti, Ethiopia, and Kenya. While this lan- 
guage has a tonal component, there is also significant overlap with the English alphabet. Both of these 
cultures have an emphasis on oral tradition. 

The Present Study 

The purpose of this study was to investigate the concurrent and predictive validity of a CBM mea- 
sure of oral reading fluency for ELL students. One objective was to provide validity evidence for CBM 
as a predictor of a state-mandated proficiency level assessment of reading for ELL students. These results
would be used to validate the ability to make inferences from ELL students’ CBM scores to reading
in general, and also in predictions of proficiency status on a high-stakes assessment. This study focused 
on three distinct ELL populations represented by their home languages, which are Spanish, Hmong, and 
Somali. 

The findings of this study potentially add an important piece to the CBM literature regarding the use 
of oral fluency measures on students who speak a native language that is very different from the primar- 
ily phonetic-based English and Spanish languages. All three of these language groups, along with dozens 
of others, are commonly grouped together for instruction; yet, their backgrounds and instructional needs
may be very divergent. It would be helpful to know if we can use a common method scaled on a com- 
mon metric for monitoring their progress in reading. This would also be particularly useful if the unit of 
measurement could be used as a formative assessment. CBM has been used as one of the tools that can 
provide efficient and reliable data for this purpose with English speaking students; however, it is neces- 
sary to determine if CBM is valid in this role with ELL students. 

METHOD 

Participants 

This study took place during the 2003-2004 school year in a large urban school district located in 
the Midwestern United States. Participants were fifth-grade students who had received an ELL status in 
the district and reported their home language was Spanish, Somali, or Hmong (N = 1,529). Other
home languages were not included because the selected language groups make
up the great majority of ELL students (around 88%). The remaining ELL students spoke 72 other
languages and were excluded due to the small number of students in each language group. Due to mobility
and absences, 1,205 students (78% of possible participants) were measured on both CBM and the
Minnesota Comprehensive Assessment (MCA). Table 1 delineates the demographic percentages in the
targeted population and the corresponding percentages in the sample. The sample appears to be
representative, and it is assumed that any loss of students was the result of random processes that did
not relate to systematic procedures or the students’ actual reading ability.

TABLE 1: Demographic Information

Demographic Information                               Population (N = 1,529)   Sample (N = 1,205)
Gender          Male                                  52%                      52%
                Female                                48%                      48%
Home Language   Spanish                               48%                      46%
                Hmong                                 42%                      44%
                Somali                                10%                      10%
SES Proxy       Eligible for Free/Reduced Lunch       95%                      94%
                Not Eligible for Free/Reduced Lunch   5%                       6%

Materials & Procedure 

The CBM oral reading fluency measures used in this study were grade-level passages drawn from 
the district basal reading text, Invitations to Literacy, which was published by Houghton Mifflin in 1999.
Passages were selected so as to represent the subjects, authors and styles found within the curriculum. 
Readability levels and pilot studies with district students were also used to ensure that the passages were 
of similar difficulty. Completion of CBM data collection involved the students reading three different 
passages for one minute each and the calculation of the median number of words read correct for data 
analyses. All data were collected following the standardized procedures outlined in the district manual, 
“Performance Assessment of Reading in the Problem Solving Model” (Minneapolis Public Schools, 
2003). These CBM administration and scoring procedures have been described by Marston and Magnus- 
son (1985), and Shinn (1989). All participants were administered a CBM measure in September as part 
of their school’s building-wide continuous progress monitoring system.
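
The scoring rule described above (three one-minute passages, with the median number of words read correct retained for analysis) can be sketched as follows. This is a minimal illustration only; the function name and the example scores are hypothetical and are not taken from the district manual.

```python
from statistics import median


def cbm_score(words_read_correct):
    """Return the median words read correct across three one-minute passages."""
    if len(words_read_correct) != 3:
        raise ValueError("expected scores from exactly three passages")
    return median(words_read_correct)


# Hypothetical scores for one student on the three passages.
print(cbm_score([74, 82, 79]))
```

Using the median rather than the mean keeps a single unusually easy or difficult passage from shifting the student's score.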

The Minnesota Comprehensive Assessment (MCA) is a measure of reading proficiency (Minnesota 
Department of Education, n.d.). On the fifth grade test, students read passages and answered multiple-choice
and constructed response items requiring them to write short answers related to the purpose of the
passage and the main idea. In addition, the student must be able to synthesize information in the story
to develop conclusions and make inferences. All items were designed to be aligned with the Minnesota
High Standards. In order to minimize variance, all items are scored by the test vendor.

Two types of scores are derived from the MCA, a level score and a standard score. The level scores 
range from 1 to 5. A student scoring at level 1 is described as having gaps in knowledge and skills. Those 
scoring at level 3 are described as having solid on-grade level skills, while those at level 5 are considered 
to have superior performance beyond grade level. Level 3 is the level where students are considered to be 
proficient, and corresponds to a standard score of 1420 or a raw score of 40. The raw scores on the MCA
ranged from 0 to 58, with a mean of 45.81, a standard deviation of 10.34, and a reliability coefficient of 
.92 (Minnesota Department of Education, n.d.). The difficulty of the MCA was determined through use 
of the Degrees of Reading Power (DRP). According to the Minnesota Department of Education, the 5th grade passages used on the MCA
have an average DRP of 54, which is the level of a typical fifth grade textbook. 

For this study, the CBM data were collected shortly after school began in the fall of 2003 as part of 
the building-wide progress monitoring system. The MCA was administered in late April. Therefore there 
was about a six-month intervening period between the CBM and MCA. 

Data Analysis 

Data analyses examined whether a CBM measure at the beginning of the school year could be 
helpful in predicting scores for ELL students toward the end of the year on a state-mandated, high- 
stakes, standards-based assessment in reading. Simple regression analyses were completed to address 
this question (Neter, Kutner, Nachtsheim, & Wasserman, 1996). Logistic regression models (Hosmer & 
Lemeshow, 1989) were used to assess the predictive validity of using CBM to estimate proficiency on 
the MCA. The MCA score of 1420 is the state-mandated cut-score for proficiency levels. Thus any score 
greater than or equal to 1420 was coded as ‘pass’ and any score less than 1420 was coded as ‘fail’. It 
should be recognized that the student does not actually fail the test but rather fails to get a score that in- 
dicates proficiency in the area of reading. This model allows for the computation of diagnostic accuracy 
statistics. The basis for this is the generation of a predicted p-value, that is, a predicted probability of
passing, from the model (Hosmer & Lemeshow, 1989; Neter, Kutner, Nachtsheim, & Wasserman, 1996).
For this analysis, we utilized a p-value
of 0.5 as the cut-point for classification. Thus, students with a predicted p-value of greater than 0.5 were 
predicted to pass, while those less than 0.5 were predicted to fail. 
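
As a minimal sketch of the coding and classification steps just described, the following shows how an MCA standard score is coded as pass or fail at 1420 and how a predicted probability is turned into a predicted status at the 0.5 cut-point. The logistic coefficients are hypothetical placeholders chosen for illustration; they are not the estimates from this study, which are not reported in the text.

```python
import math

MCA_CUT_SCORE = 1420  # state-mandated proficiency cut score reported in the text


def code_mca_status(standard_score):
    """Code an MCA standard score as 1 ('pass') or 0 ('fail')."""
    return 1 if standard_score >= MCA_CUT_SCORE else 0


def predicted_probability(cbm_score, intercept, slope):
    """Logistic model: predicted probability of passing given a CBM score."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * cbm_score)))


def classify(probability, cut_point=0.5):
    """Predict 'pass' (1) when the predicted probability exceeds the cut-point."""
    return 1 if probability > cut_point else 0


# Hypothetical coefficients for illustration only; not estimates from this study.
B0, B1 = -5.0, 0.04

for cbm in (90, 150):
    p = predicted_probability(cbm, B0, B1)
    print(cbm, round(p, 3), classify(p))
```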

Furthermore, the analyses examined the following eight measures of diagnostic accuracy (Swets,
Dawes & Monahan, 2000): total correct classifications, total incorrect classifications, sensitivity, speci- 
ficity, false positives, false negatives, positive predictive power and negative predictive power. The total 
correct classification is defined as the number of students who were predicted to pass and did actually 
pass plus the number of students predicted to fail who actually did fail. This sum is then divided by the 
number of students in the study. One minus this proportion gives the misclassification rate
of the model.

Sensitivity is defined as the conditional probability of getting a logistic regression p-value of greater 
than 0.5 given the student actually got a passing score. The basic interpretation is that it tells us the per- 
cent of students we predict to pass out of the subset of students who actually did pass. More directly, it 
computes the probability that the CBM scores correctly identified a student as passing from the subset of 
all students who actually did pass the MCA proficiency level in reading. This expresses how sensitive the 
scores from the CBM are at identifying students who will make the proficiency standard. 

Specificity is defined as the conditional probability of getting a logistic regression p-value of less
than 0.5 given the student actually did fail to reach the 1420 cut-point for proficiency. The basic inter- 
pretation is that it tells us the percent of students we predicted to fail out of the subset of students who 
did actually fail. More directly, it computes the probability that a CBM score will correctly identify a 
student as not meeting the proficiency level out of the subset of students who actually did not meet the 
proficiency level. The probability expresses the ability of the CBM scores to specify those who are un- 
able to meet the proficiency level. 

There are other methods of defining sensitivity and specificity. Many researchers will reverse the 
definitions of sensitivity and specificity to highlight the ability of a procedure to be sensitive to deficits 
in some area. This would be akin to reversing our definitions and would highlight an attempt to be 
sensitive to reading problems in ELL students. Under these circumstances, one could simply reverse 
the identifications in the above definitions. This is pointed out to alleviate any potential confusion this 
might engender with respect to the literature on diagnostic accuracy measures of deficits or identifiable 
disabling conditions.

In addition, analyses examined the amount of error associated with the classification predictions,
by analyzing the extent to which incorrect classifications were observed. Besides using one minus the 
correct classification, it is also possible to specify the number of false positive and false negative iden- 
tifications. False positives are defined as those students predicted to pass (p-value > 0.5), who did not 
actually pass. This is similar to a Type I error. False negatives are defined as those students predicted to 
fail (p-value < 0.5), who did actually pass. This is similar to a Type II error. 

The positive and negative predictive power gives us an estimate of how well the CBM scores pre- 
dict passing or not passing status. The positive predictive power provides the conditional probability 
that, given a person is predicted to pass the reading proficiency level (p > 0.5), they actually do pass the
proficiency level. This provides a relative likelihood of the student actually passing the proficiency level
given the fact that they are predicted to pass based on the CBM scores. The negative predictive power
is the conditional probability that, given a student is predicted not to pass the MCA (p < 0.5), the student
actually does not pass. These measures should not be confused with sensitivity and specificity, which
are conditional probabilities computed over different base groups. 
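
To make the different base groups concrete, the sketch below computes the main indices from a two-by-two classification matrix under the definitions used here, where "positive" means predicted to pass. The function name and dictionary keys are illustrative; the counts are those reported in Table 2.

```python
def diagnostic_accuracy(pass_pred_pass, pass_pred_fail, fail_pred_pass, fail_pred_fail):
    """Indices from a 2 x 2 matrix; the first word of each argument is the actual status."""
    total = pass_pred_pass + pass_pred_fail + fail_pred_pass + fail_pred_fail
    return {
        # Correct predictions over all students.
        "correct_classification": (pass_pred_pass + fail_pred_fail) / total,
        # Base group: students who actually passed.
        "sensitivity": pass_pred_pass / (pass_pred_pass + pass_pred_fail),
        # Base group: students who actually did not pass.
        "specificity": fail_pred_fail / (fail_pred_fail + fail_pred_pass),
        # Base group: students predicted to pass.
        "positive_predictive_power": pass_pred_pass / (pass_pred_pass + fail_pred_pass),
        # Base group: students predicted not to pass.
        "negative_predictive_power": fail_pred_fail / (fail_pred_fail + pass_pred_fail),
    }


# Counts reported in Table 2.
indices = diagnostic_accuracy(166, 211, 87, 741)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and positive predictive power share a numerator (students who were predicted to pass and did pass) but divide by different groups, which is why the two values can differ substantially.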

Finally, the logistic regression results were examined using a receiver operating characteristic
(ROC) curve analysis. This provides a measure of discrimination in using the CBM scores to classify
later MCA proficiency (Hanley & McNeil, 1982; Hosmer & Lemeshow, 1989). Each of the three
language groups was analyzed individually to examine whether the CBM has any differential functioning
between the groups. The area under the curve (AUC) statistic from the ROC analysis,
computed using the predicted values from the logistic regression, gives us a measure of the ability of the CBM
to discriminate between pairs of individuals. AUC results of greater than or equal to 0.9 are considered
to provide outstanding discrimination, values between 0.8 and 0.9 are considered excellent, and values 
between 0.7 and 0.8 are considered acceptable (Hosmer & Lemeshow, 1989). 
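
The pairwise interpretation of the AUC can be illustrated with a short sketch: across all pairs formed by one student who passed and one who did not, the AUC is the proportion of pairs in which the student who passed received the higher predicted value, with ties counted as one half. The predicted values below are hypothetical, and the label for values under 0.7 is an assumption added for completeness.

```python
def auc_from_pairs(values_passed, values_failed):
    """Proportion of (passed, failed) pairs ranked correctly; ties count as 0.5."""
    concordant = 0.0
    for p in values_passed:
        for f in values_failed:
            if p > f:
                concordant += 1.0
            elif p == f:
                concordant += 0.5
    return concordant / (len(values_passed) * len(values_failed))


def discrimination_label(auc):
    """Descriptive bands attributed in the text to Hosmer and Lemeshow (1989)."""
    if auc >= 0.9:
        return "outstanding"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    return "below acceptable"  # label for this band is an assumption


# Hypothetical predicted probabilities for illustration only.
passed = [0.80, 0.60, 0.55]
failed = [0.70, 0.40, 0.30, 0.20]
auc = auc_from_pairs(passed, failed)
print(round(auc, 3), discrimination_label(auc))
```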

RESULTS 

The results of descriptive statistics indicated that the median number of words read correctly on 
CBM was about 80 with a standard deviation of about 33. The average MCA score was about 1313 with 
a standard deviation of about 170. Given the proficiency level cut-score of 1420 on the MCA, this indi- 
cated that approximately 74% of the students did not reach the proficiency level. 

The results of the regression analysis indicate that the fall CBM measure is a
significant predictor of the MCA reading score in the spring of the year, F(1, 1203) = 749.79; p < 0.001;
R² = 0.39. This significant result, together with a large effect size (Cohen, 1988), provides
fairly strong validity evidence.

In addition, we observe that both the intercept (β₀ = 1064.02) and slope (β₁ = 3.22) parameters from
the model were significant (p < 0.001). The results indicate that for every single word increase in CBM 
scores in the fall, there is an expected increase in MCA reading scores of about three points. Therefore, 
from this result one would expect a score of 1420 on the MCA with a fall median CBM score of about 
111 words read correct per minute. 
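
The arithmetic behind this estimate can be checked directly from the reported intercept and slope. The sketch below is only a restatement of the fitted line; the function names are illustrative.

```python
INTERCEPT = 1064.02  # reported intercept
SLOPE = 3.22         # reported slope (MCA points per word read correct)


def predicted_mca(cbm_score):
    """Predicted spring MCA standard score from a fall CBM score."""
    return INTERCEPT + SLOPE * cbm_score


def cbm_for_target(mca_target):
    """Fall CBM score at which the fitted line reaches a target MCA score."""
    return (mca_target - INTERCEPT) / SLOPE


print(round(cbm_for_target(1420), 1))  # 110.6, i.e. about 111 words per minute
```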

The CBM score predictions of student classification with respect to the passage of the state-mandat- 
ed test were also investigated. Passage on the test indicates proficiency levels in the area of reading and 
is also utilized for Adequate Yearly Progress (AYP) calculations for schools. Therefore, it is helpful to 
know how well it predicts passing on the mandatory, high-stakes test. This investigation was completed 
in two steps. The first step was to fit the logistic regression of MCA status (where 1 = pass
and 0 = not pass) on CBM scores. The second step was to analyze these results by a ROC procedure to
identify how well CBM discriminated between those two groups. 

The logistic regression results indicated that the CBM measure was a significant predictor of proficiency
status on the MCA reading (χ² = 285.833; p < 0.001) and accounted for about 30% (Nagelkerke’s
R² = 0.297) of the maximal variance. Based on the results from the logistic regression, the diagnostic
accuracy indices were tabulated (see Table 2). 

TABLE 2. Classification Matrix Based on the Logistic Regression

                                       Predicted
Actual                     Pass        Not Pass    Marginal Percent Correct
Pass                       166         211         44.0
Not Pass                   87          741         89.5
Marginal Percent Correct   65.6        77.8        75.3


By using the CBM scores, one can achieve an approximate correct classification percentage of about 
75%. This indicates that about three out of four ELL students can be correctly classified on later profi- 
ciency status based on their CBM scores from the beginning of the school year. Therefore about one out 
of four will be incorrectly classified. Overall, the CBM appears to lend validity to inferences about later 
reading proficiency outcomes as measured by a state-mandated, high-stakes test. 

To get a better understanding of the diagnostic accuracy associated with classification, the other 
measures were examined. The sensitivity of CBM to predict passing, given a student actually passes, was 
about 44%. The specificity was about 90%. False positives were about 34% and false negatives were
about 22%. The positive predictive power was about 66% and the negative predictive power was about 
78%. It should be noted that any changes in the cut-point for classification can and will change these 
proportions. For our analysis we used the p-value of greater than or equal to 0.5 based on the logistic re- 
gression computations. From the regression analysis the predicted cut-point for classification was about 
111 words read correct per minute. If this cut-point is changed in any manner, the associated diagnostic
accuracy indices will also be changed. To identify the potential trade-offs in these indices the receiver 
operating characteristic (ROC) curve can be used (see Figure 1). This figure visually represents the associated
trade-offs between sensitivity and 1 - specificity related to different cut-points over the range of
CBM scores. 
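
The trade-off can be illustrated by sweeping a cut-point over CBM scores and recomputing sensitivity and 1 - specificity at each value. The sketch below uses a small hypothetical data set, not the study data, and classifies a student as predicted to pass when the CBM score is at or above the cut-point.

```python
def roc_points(cbm_scores, passed, cut_points):
    """Return (cut, sensitivity, 1 - specificity) for each CBM cut-point."""
    n_pass = sum(passed)
    n_fail = len(passed) - n_pass
    points = []
    for cut in cut_points:
        # Students predicted to pass who actually passed.
        hits = sum(1 for s, y in zip(cbm_scores, passed) if s >= cut and y == 1)
        # Students predicted to pass who did not pass.
        misses = sum(1 for s, y in zip(cbm_scores, passed) if s >= cut and y == 0)
        points.append((cut, hits / n_pass, misses / n_fail))
    return points


# Hypothetical fall CBM scores and later pass (1) / not pass (0) status.
scores = [60, 75, 90, 100, 105, 115, 120, 130, 140, 150]
status = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]

for cut, sens, one_minus_spec in roc_points(scores, status, [90, 111, 130]):
    print(cut, round(sens, 2), round(one_minus_spec, 2))
```

In this illustration, raising the cut-point lowers both columns: fewer students who fail are predicted to pass, but fewer students who pass are identified as well.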

FIGURE 1. ROC Curve Indicating Tradeoff in Diagnostic Accuracy. 

The second part of the analysis evaluated the discrimination ability of the CBM measure. For this
a ROC analysis was completed. This method allows us to evaluate the effectiveness of the model to 
discriminate between the two different groups (those that passed and those that did not) based on the 
CBM scores, in general, and then within the different language groups (see Table 3). For this analysis the 
p-value computed from the logistic regression was utilized. The area under the curve (AUC) provides a 
useful metric about how well the model using the CBM score discriminates between students who later 
pass or do not pass the MCA reading proficiency. The closer the value is to unity, the better job it does 
at discriminating. 

The ROC analysis indicated significant results and acceptable discrimination for the overall group 
and for each group individually. These results suggest that in about 78% of all possible pairs of cases 
(AUC = 0.78) in which one student passed and another student failed, using only the CBM score from 
the Fall semester, the logistic model assigns a higher probability of passing to those who actually passed. 
This suggests that the CBM scores are a valid indicator of later passing status on a state-mandated
reading proficiency test.

Next, the analysis was run on the different language groups. This allows for an analysis of
whether the CBM discrimination index differs between the groups. The results of this analysis indicate
that the different language groups have overlapping confidence intervals with respect to the area
under the curve (see Table 3). This suggests that the model works about as well for each group and that the
CBM is similarly discriminating within all the language groups.

TABLE 3. Results of the ROC Analyses

                                                                               Asymptotic 95% Confidence Interval
Variables    Area under the curve   Standard Error   Asymptotic Significance   Lower Bound   Upper Bound
All Groups   0.784                  0.014            p < 0.001                 0.758         0.811
Spanish      0.796                  0.023            p < 0.001                 0.751         0.840
Hmong        0.779                  0.021            p < 0.001                 0.737         0.821
Somali       0.778                  0.047            p < 0.001                 0.686         0.870


DISCUSSION 

A basic tenet of data-based procedures is the necessity of using well-grounded and psychometrically
sound measures to accumulate information on student standing and progress. While CBM has
a long history as a valid and reliable measure of reading (Deno, 2003), the provisions of NCLB have 
brought renewed focus to the importance of establishing the generalization of CBM measures to students 
of all backgrounds. This study provides additional information regarding the validity of using CBM oral 
reading fluency measures in English with English Language Learners. These measures, when used at 
the beginning of the fifth grade school year, provided a good predictor of later reading status on a state- 
mandated proficiency level test. 

These results are particularly relevant because of the rapid growth in the ELL student population. 
ELL students can pose unique challenges for school staff, as they are challenged to learn about varied 
cultures, language backgrounds, levels of vocabulary development, etc. In addition, little information 
may be available on each student’s specific background and development in reading. Efficient and reli- 
able methods of early assessment are necessary for teachers and support staff to be able to direct inter- 
ventions toward those students at-risk of failing to meet state proficiency standards. This study suggests 
that the long-standing findings related to the validity of CBM measures are also applicable to ELL 
students. The findings of this study also support the notion that these results are applicable across three 
very divergent language groups (Spanish, Hmong, and Somali). This is particularly interesting because
the limited previous research supporting the use of CBM with ELL students has primarily focused
upon Spanish-speaking students; as far as we are aware, no study has been conducted with Hmong and Somali
students.

The establishment of CBM as a valid tool for the purposes of screening and progress monitoring 
with ELL students can provide a practical framework for implementation of a Problem-Solving Model or 
RTI type approach to intervention and decision making. This approach seems particularly well suited to 
ELL students, for whom the use of norm-referenced assessment measures for special education eligibil- 
ity decisions has long been in dispute. 

However, despite this strong overall predictive ability, CBM provided better classification informa- 
tion for students who did not pass the proficiency level test, while providing weak classification informa- 
tion about the later status of students who actually did pass the proficiency level. These results suggest 
that the CBM has a high level of specificity and, thus, is a good indicator of later status as failing to 
meet the proficiency level in reading on the state-mandated proficiency test. While the ability to
correctly predict the classification of those students who do not pass proficiency exams seems more
important than correct prediction of those who eventually pass the test, it may be that the inclusion of
additional variables in the process would enhance overall sensitivity.

LIMITATIONS AND IMPLICATIONS FOR FUTURE RESEARCH 

While this study attempted to limit the variability of the sample somewhat by restricting the lan- 
guage groups studied to three languages, the background of the sample remained very diverse. The ELL 
sample in this study varied on many important factors, including the number of years they have lived in 
the U.S., their level of acculturation, formal and informal educational background, and type of English 
language support services received. Such variability may actually increase the generalizability of these
findings, but it also points to the need for caution in applying these findings to any individual student. Another
limitation of this study was that the CBM data used were collected by school staff members as part of 
their annual fall screening activities, and not as a part of a designed research study. Although the district 
has implemented a standardized data collection procedure with published reading probes and an instruction
manual based on best practices, there is no specific procedure in place to monitor the fidelity of
data collection. 

In terms of implications for future research, it would be helpful to compare the cut score points for 
the ELL sample to that of their English speaking peers and to further break down the sample by ethnicity. 
We are currently working on a larger study of differential prediction which will look at issues of both 
race and language history. This study can be extended downward to determine if the same methodology 
works for younger students who are farther from the required high-stakes assessments. It is also possible 
that this type of research will provide a method for administrators to make differential decisions based on 
costs and benefits related to different cut-off levels for classification. This study used a basic probability 
outcome level of 0.5 as the cut-off point. However, the utilization of different cut-off points and their ef- 
fects on classification can be assessed by the ROC analysis. As noted by Hintze and Silberglitt (2005) it 
may be helpful to set separate cut scores for different decision purposes (e.g., screening, classification, and
entitlement). The ROC graph provides a flexible and visual method to assess the relationship between 
the changes in sensitivity and the false positive error rate (1 - specificity).